Pinch Ratio Clustering from a Topologically Intrinsic Lexicographic Ordering

Authors

  • Douglas R. Heisterkamp
  • Jesse Johnson
Abstract

This paper introduces an algorithm for determining data clusters called TILO/PRC (Topologically Intrinsic Lexicographic Ordering/Pinch Ratio Clustering). The theoretical foundation for this algorithm, developed in [14], uses ideas from topology (particularly knot theory) suggesting that it should be very flexible and robust with respect to noise. The TILO portion of the algorithm progressively improves a linear ordering of the points in a data set until the ordering satisfies a topological condition called strongly irreducible. The PRC algorithm then divides the data set based on this ordering and a heuristic metric called the pinch ratio. We demonstrate the effectiveness of TILO/PRC for finding clusters in a wide variety of real and synthetic data sets and compare the results to existing clustering methods. Moreover, because the output of TILO depends on the initial ordering, we consider the effects of different random orderings on the final clusters defined by PRC, and show that choosing an initial ordering based on a different clustering algorithm can improve the final clusters. These results verify that both the theoretical foundations of TILO and the heuristic notion of pinch ratio are reasonable.

Similar resources

Repeated Record Ordering for Constrained Size Clustering

One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...

The Light Lexicographic path Ordering

We introduce syntactic restrictions of the lexicographic path ordering to obtain the Light Lexicographic Path Ordering. We show that the light lexicographic path ordering leads to a characterisation of the functions computable in space bounded by a polynomial in the size of the inputs.

Permuting Web Graphs

Since the first investigations on web graph compression, it has been clear that the ordering of the nodes of the graph has a fundamental influence on the compression rate (usually expressed as the number of bits per link). The author of the LINK database [1], for instance, investigated three different approaches: an extrinsic ordering (URL ordering) and two intrinsic (or coordinate-free) orderi...

A revisit of a mathematical model for solving fully fuzzy linear programming problem with trapezoidal fuzzy numbers

In this paper fully fuzzy linear programming (FFLP) problem with both equality and inequality constraints is considered where all the parameters and decision variables are represented by non-negative trapezoidal fuzzy numbers. According to the current approach, the FFLP problem with equality constraints first is converted into a multi–objective linear programming (MOLP) problem with crisp const...

Lexicographical ordering by spectral moments of trees with a given bipartition

Lexicographic ordering by spectral moments ($S$-order) among all trees is discussed in this paper. For two given positive integers $p$ and $q$ with $p \leqslant q$, we denote $\mathscr{T}_n^{p, q} = \{T : T \text{ is a tree of order } n \text{ with a } (p, q)\text{-bipartition}\}$. Furthermore, the last four trees, in the $S$-order, among $\mathscr{T}_n^{p, q}$ $(4 \leqslant p \leqslant q)$ are characterized.

Publication date: 2013